Integrating Multiple Knowledge Sources for Detection and Correction of Repairs in Human-Computer Dialog

Authors

  • John Bear
  • John Dowding
  • Elizabeth Shriberg

Abstract

We have analyzed 607 sentences of spontaneous human-computer speech data containing repairs, drawn from a total corpus of 10,718 sentences. We present here criteria and techniques for automatically detecting the presence of a repair, locating it, and making the appropriate correction. The criteria involve integration of knowledge from several sources: pattern matching, syntactic and semantic analysis, and acoustics.

INTRODUCTION

Spontaneous spoken language often includes speech that is not intended by the speaker to be part of the content of the utterance. This speech must be detected and deleted in order to correctly identify the intended meaning. The broad class of disfluencies encompasses a number of phenomena, including word fragments, interjections, filled pauses, restarts, and repairs. We are analyzing the repairs in a large subset (over ten thousand sentences) of spontaneous speech data collected for the DARPA Spoken Language Program.¹ We have categorized these disfluencies as to type and frequency, and are investigating methods for their automatic detection and correction. Here we report promising results on detection and correction of repairs by combining pattern matching, syntactic and semantic analysis, and acoustics. This paper extends work reported in an earlier paper (Shriberg et al., 1992a).

The problem of disfluent speech for language understanding systems has been noted but has received limited attention. Hindle (1983) attempts to delimit and correct repairs in spontaneous human-human dialog, based on transcripts containing an "edit signal," or external and reliable marker at the "expunction point," or point of interruption. Carbonell and Hayes (1983) briefly describe recovery strategies for broken-off and restarted utterances in textual input. Ward (1991) addresses repairs in spontaneous speech, but does not attempt to identify or correct them.

Our approach is most similar to that of Hindle. It differs, however, in that we make no assumption about the existence of an explicit edit signal. As a reliable edit signal has yet to be found, we take it as our problem to find the site of the repair automatically. Cues to repair do exist, however, over a range of syllables. Research in speech production has shown that repairs tend to be marked prosodically (Levelt and Cutler, 1983), and there is perceptual evidence from work using low-pass-filtered speech that human listeners can detect the occurrence of a repair in the absence of segmental information (Lickley, 1991).

In the sections that follow, we describe in detail our corpus of spontaneous speech data and present an analysis of the repair phenomena observed. In addition, we describe ways in which pattern matching, syntactic and semantic analysis, and acoustic analysis can be helpful in detecting and correcting these repairs. We use pattern matching to determine an initial set of possible repairs; we then apply information from syntactic, semantic, and acoustic analyses to distinguish actual repairs from false positives.

* This research was supported by the Defense Advanced Research Projects Agency under Contract ONR N00014-90-C-0085 with the Office of Naval Research. It was also supported by a Grant, NSF IRI-8905249, from the National Science Foundation. The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the Defense Advanced Research Projects Agency of the U.S. Government, or of the National Science Foundation.

† Elizabeth Shriberg is also affiliated with the Department of Psychology at the University of California at Berkeley.

¹ DARPA is the Defense Advanced Research Projects Agency of the United States Government.
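
The two-stage idea described above (pattern matching proposes candidate repairs, later analyses filter out false positives) can be illustrated with a minimal sketch. The specific pattern used here, an immediately repeated word sequence, is an assumption made for illustration and is not taken from the paper's pattern set; the function names `find_repeat_candidates` and `delete_candidates` and the `max_len` parameter are likewise hypothetical.

```python
from typing import List, Tuple


def find_repeat_candidates(words: List[str], max_len: int = 3) -> List[Tuple[int, int]]:
    """Return (start, length) for each word sequence that is immediately repeated.

    Illustrative pattern only: a real system would use a richer pattern set
    and would treat these hits as candidates, not confirmed repairs.
    """
    candidates = []
    i = 0
    while i < len(words):
        matched = False
        # Prefer the longest repeated sequence starting at position i.
        for n in range(max_len, 0, -1):
            first = words[i:i + n]
            second = words[i + n:i + 2 * n]
            if len(second) == n and first == second:
                candidates.append((i, n))
                i += n  # skip past the first copy
                matched = True
                break
        if not matched:
            i += 1
    return candidates


def delete_candidates(words: List[str], candidates: List[Tuple[int, int]]) -> List[str]:
    """Apply the correction by deleting the first copy of each repeated sequence."""
    drop = set()
    for start, n in candidates:
        drop.update(range(start, start + n))
    return [w for k, w in enumerate(words) if k not in drop]


utterance = "show me the flights the flights from boston".split()
hits = find_repeat_candidates(utterance)
corrected = " ".join(delete_candidates(utterance, hits))
```

A pattern this simple also fires on fluent text such as "i know that that flight is late", which is exactly the kind of false positive that the syntactic, semantic, and acoustic stages are meant to screen out before any deletion is applied.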

Similar resources

Automatic Detection and Correction of Repairs in Human-Computer Dialog

We have analyzed 607 sentences of spontaneous human-computer speech data containing repairs (drawn from a corpus of 10,718). We present here criteria and techniques for automatically detecting the presence of a repair, its location, and making the appropriate correction. The criteria involve integration of knowledge from several sources: pattern match...

Integrating multiple knowledge sources for improved speech understanding

In spoken dialog systems it is often the case that the sentence produced by the decoder with the highest recognition probability may not be the best choice for extracting the intended concepts. Lower ranking hypotheses may present better alternatives. In this paper, we show how to integrate multiple knowledge sources for the decision of selecting one of these hypotheses. A scoring schema combin...

The Impact of Correction for Guessing Formula on MC and Yes/No Vocabulary Tests' Scores

A standard correction for random guessing (cfg) formula on multiple-choice and Yes/No examinations was examined retrospectively in the scores of the intermediate female EFL learners in an English language school. The correction was a weighting formula for points awarded for correct answers, incorrect answers, and unanswered questions so that the expected value of the increase in test score due to g...

Design strategies for spoken language dialog systems

The development of a task-oriented spoken language dialog system requires expertise in multiple domains including speech recognition, natural spoken language understanding and generation, dialog management and speech synthesis. The dialog manager is the core of a spoken language dialog system, and makes use of multiple knowledge sources. In this contribution we report on our methodology for devel...

INTEGRATING CASE-BASED REASONING, KNOWLEDGE-BASED APPROACH AND TSP ALGORITHM FOR MINIMUM TOUR FINDING

Imagine you have traveled to an unfamiliar city. Before you start your daily tour around the city, you need to know a good route. In Network Theory (NT), this is the traveling salesman problem (TSP). A dynamic programming algorithm is often used for solving this problem. However, when the road network of the city is very complicated and dense, which is usually the case, it will take too long fo...

Publication date: 1992